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Abstract In this paper, we propose an inexact multi-block ADMM-type first-order method for 
solving a class of high-dimensional convex composite conic optimization problems to moder¬ 
ate accuracy. The design of this method combines an inexact 2-block majorized semi-proximal 
ADMM and the recent advances in the inexact symmetric Gauss-Seidel (sGS) technique for 
solving a multi-block convex composite quadratic programming whose objective contains a non¬ 
smooth term involving only the first block-variable. One distinctive feature of our proposed 
method (the sGS-imsPADMM) is that it only needs one cycle of an inexact sGS method, in¬ 
stead of an unknown number of cycles, to solve each of the subproblems involved. With some 
simple and implementable error tolerance criteria, the cost for solving the subproblems can be 
greatly reduced, and many steps in the forward sweep of each sGS cycle can often be skipped, 
which further contributes to the efficiency of the proposed method. Global convergence as well 
as the iteration complexity in the non-ergodic sense is established. Preliminary numerical ex¬ 
periments on some high-dimensional linear and convex quadratic SDP problems with a large 
number of linear equality and inequality constraints are also provided. The results show that 
for the vast majority of the tested problems, the sGS-imsPADMM is 2 to 3 times faster than 
the directly extended multi-block ADMM with the aggressive step-length of 1.618, which is cur¬ 
rently the benchmark among first-order methods for solving multi-block linear and quadratic 
SDP problems though its convergence is not guaranteed. 
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1 Introduction 

The objective of this paper is to design an efficient first-order method for solving the following 
high-dimensional convex composite quadratic conic programming problem to moderate accuracy: 

min {^(a;) -I- \{x, Qx) + (c, x) \ Ax — b = 0, x G /C}, (1.1) 

where 9 : X ^ (—oo, -l-oo] is a closed proper convex function, Q : X ^ X is a. self-adjoint 
positive semidefinite linear operator, A : X ^ y is a linear map, c G X and b G y are given 
data, /C C T is a closed convex cone, X and y are two real finite dimensional Euclidean spaces 
each equipped with an inner product (•,•) and its induced norm |j • ||. Here, the phrase “high- 
dimension” means that the linear operators AA* and/or Q are too large to be stored explicitly 
or to be factorized by the Cholesky decomposition. By introducing a slack variable u G X, one 
can equivalently recast ( 1 . 1 ) as 

min {9{u) + ^{x, Qx) + (c, x) \ Ax — b = 0, u — x = 0,xG /C}. (1-2) 

Then, solving the dual of problem (1.2) is equivalent to solving 

min {0*(—s) -I- i(w, Qw) — {b, ^) \ s + z — Qw -I- A*^ = c, z G JC*, w G W}, (1.3) 

where W C T is any subspace containing Range(Q), 1C* is the dual cone of JC defined by 
1C* := {d G X \ {d, x) > 0 y X G 1C}, 9* is the Fenchel conjugate of the convex function 
9. Particular examples of (1.1) include convex quadratic semidefinite programming (QSDP), 
convex quadratic programming (QP), nuclear semi-norm penalized least squares and robust 
PCA (principal component analysis) problems. One may refer to [19,20] and references therein 
for a brief introduction on these examples. 

Let m and n be given nonnegative integers, Z, Xi, 1 < i < m and 3/,-, 1 < j < n, be finite 
dimensional real Euclidean spaces each endowed with an inner product (•,•) and its induced 
norm || • ||. Define X := Xi x ... x Xm and 3^ := x ... x 3^„. Problem (1.3) falls within the 
following general convex composite programming model: 

min {pi{xi) X f{xi ,..., a;„) -h qi{yi) + g{yi,. •., y„) | A*x + B*y = c}, (1.4) 

where pi : Ti —>■ (—oo, oo] and qi : 3^1 —>■ (—oo, oo] are two closed proper convex functions, 
/ : T —>■ (— 00 , 00 ) and g : y ^ (— 00 , 00 ) are continuously differentiable convex functions 
whose gradients are Lipschitz continuous. The linear mappings A : X ^ y and B : X ^ Z are 
defined such that their adjoints are given by A*x = for x = {xi,... ,Xm) G X, and 

^*V = YTj=i^*iy 3 for ?/ = ( 2 / 1 ,..., y„) e y, where A* : Xi ^ Z,i = 1,... ,m and B} : y^ -G 
Z,j = 1,... ,n are the adjoints of the linear maps At '. Z ^ Xi and Bj : Z ^ yi respectively. For 
notational convenience, in the subsequent discussions we define the functions p : X ^ (~oo, 00 ] 
and q : y ^ (— 00 , 00 ] by p{x) := pi(a:i) and q{y) := < 71 ( 2 / 1 ). For problem (1.3), one can express 
it in the generic form (1.4) by setting 

Piis) = 0*{-s), f{s,w) = \{w, Qw), qi{z) = 5 k:-{z), 
giz,0 = -{y^ 0, A*{s,w) = s - Qw and B*{z, C) = z + A*C 

There are various numerical methods available for solving problem (1.4). Among them, perhaps 
the first choice is the augmented Lagrangian method (ALM) pioneered by Hestenes [15], Powell 
[23] and Rockafellar [25], if one does not attempt to exploit the composite structure in (1.4) 
to gain computational efficiency. Let cr > 0 be a given penalty parameter. The augmented 
Lagrangian function of problem (1.4) is defined as follows: for any {x,y,z) G X x y x Z, 

Caix,y;z) :=p{x) + fix) + q{y)+g{y) + {z, A*x + B*y - c) + '^\\A*x + B*y - c]]^. 
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Given an initial point € Z, the ALM consists of the following iterations: 

{x^+\y^+^) ~ argmin £„{x,y; (1.5) 

^fc+i _p ^*/+i _ c), 

where r G (0, 2) is the step-length. A key attractive property of the ALM and its inexact variants, 
including the inexact proximal point algorithm (PPA) obeying summable error tolerance criteria 
proposed by Rockafellar [25,26] and the inexact PPAs proposed by Solodov and Svaiter [27,28, 
29] using relative error criteria, is their fast local linear convergence property when the penalty 
parameter exceeds a certain threshold. However, it is generally difficult and expensive to solve the 
inner subproblems in these methods exactly or to high accuracy, especially in high-dimensional 
settings, due to the coupled quadratic term interacting with two nonsmooth functions in the 
augmented Lagrangian function. By exploiting the composite structure of (1.4), one may use the 
block coordinate descent (BCD) method to solve the subproblems in (1.5) inexactly. However, it 
can also be expensive to adopt such a strategy as the number of BCD-type cycles needed to solve 
each subproblem to the required accuracy can be large. In addition, it is also computationally 
not economical to use the ALM during the early stage of solving problem (1.4) when the fast 
local linear convergence of ALM has not kicked in. 

A natural alternative to the ALM, for solving linearly constrained 2-block convex optimiza¬ 
tion problems such as (1.4), is the alternating direction method of multipliers (ADMM) [10, 
11,12,13], which solves x and y alternatively in a Gauss-Seidel fashion (one may refer to [7] 
for a recent tutorial on the ADMM). Computationally, such a strategy can be beneficial be¬ 
cause solving a; or 2 / by fixing the other variable in (1.5) is potentially easier than solving x 
and y simultaneously. Just as in the case for the ALM and PPAs, one may also have to solve 
the ADMM subproblems approximately. Indeed, for this purpose, Eckstein and Bertsekas [8] 
proposed the first inexact version of the ADMM based on the PPA theory and Eckstein [6] 
introduced a proximal ADMM (PADMM) to make the subproblems easier to solve. The inex¬ 
act version of Eckstein’s PADMM and its extensions can be found in [14,17,22], to name a few. 
These ADMM-type methods are very competitive for solving certain 2-block separable problems 
and they have been used frequently to generate a good initial point to warm-start the ALM. 
However, for many other cases such as the high-dimensional multi-block convex composite conic 
programming problem (1.1) and its dual (1.3), it can be very expensive to solve the ADMM 
subproblems (each is a composite problem with smooth and nonsmooth terms in two or more 
blocks of variables) to high accuracy. Also, by using BCD-type methods to solve these subprob¬ 
lems, one may still face the same drawback as in solving the ALM subproblems by requiring an 
unknown number of BCD inner iteration cycles. One strategy which may be adopted to alleviate 
the computational burden in solving the subproblems is to divide the variables in (1.4) into three 
or more blocks (depending on its composite structure), and to solve the resulting problems by a 
multi-block ADMM-type method (by directly extending the 2-block ADMM or PADMM to the 
multi-block setting). However, such a directly extended method may be non-convergent as was 
shown in [1], even if the functions / and g are separable with respect to these blocks of variables, 
despite ample numerical results showing that it is often practically efficient and effective [30]. 
Thus, different strategies are called for to deal with the numerical difficulty just mentioned. 

Our primary objective in this paper is to construct a multi-block ADMM-type method for 
solving high-dimensional multi-block convex composite optimization problems to medium ac¬ 
curacy with the essential flexibility that the inner subproblems are allowed to be solved only 
approximately. We should emphasize that the flexibility is essential because this gives us the 
freedom to solve large scale linear systems of equations (which typically arise in high-dimensional 
problems) approximately by an iterative solver such as the conjugate gradient method. With¬ 
out such a flexibility, one would be forced to modify the corresponding subproblem by adding 
an appropriately chosen “large” semi-proximal term so as to get a closed-form solution for the 
modified subproblem. But such a modification can sometimes significantly slow down the outer 
iteration as we shall see later in our numerical experiments. 
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In this paper, we achieve our goal by proposing an inexact symmetric Gauss-Seidel (sGS) 
based majorized semi-proximal ADMM (we name it as sGS-imsPADMM for easy reference) 
for solving (1.4), for which each of its subproblems only needs one cycle of an inexact sGS 
iteration instead of an unknown number of cycles. Our method is motivated by the works of 
[18] and [21] in that it is developed via a novel integration of the majorization technique used 
in [18] with the inexact symmetric Gauss-Seidel iteration technique proposed in [21] for solving 
a convex minimization problem whose objective is the sum of a multi-block quadratic function 
and a non-smooth function involving only the first block. However, non-trivially, we also design 
checkable inexact minimization criteria for solving the sPADMM subproblems while still being 
able to establish the convergence of the inexact method. Our convergence analysis relies on the 
key observation that the results in [30] and [21] are obtained via establishing the equivalence of 
their proposed algorithms to particular cases of the 2-block sPADMM in [9] with some specially 
constructed semi-proximal terms. Owing to the inexact minimization criteria, the cost for solving 
the subproblems in our proposed algorithm can greatly be reduced. For example, one can now 
solve a very large linear system of equations via an iterative solver to an appropriate accuracy 
instead of a very high accuracy as required by a method with no inexactness flexibility. 

Moreover, by using the majorization technique, the two smooth functions / and g in (1.4) 
are allowed to be non-quadratic. Thus, the proposed method is capable of dealing with even 
more problems beyond the scope of convex quadratic conic programming. The success of our 
proposed inexact sGS-based ADMM-type method would thus also meet the pressing demand 
for an efficient algorithm to find a good initial point to warm-start the augmented Lagrangian 
method so as to quickly enjoy its fast local linear convergence. 

To summarize, the main contribution of this paper is that by taking advantage of the inexact 
sGS technique in [21], we develop a simple, implementable and efficient inexact first-order algo¬ 
rithm, the sGS-imsPADMM, for solving high-dimensional multi-block convex composite conic 
optimization problems to moderate accuracy. We have also established the global convergence 
as well as the non-ergodic iteration complexity of our proposed method. Preliminary numerical 
experiments on the class of high-dimensional linear and convex quadratic SDP problems with a 
large number of linear equality and inequality constraints are also provided. The results show 
that on the average, the sGS-imsPADMM is 2 to 3 times faster than the directly extended 
multi-block ADMM even with the aggressive step-length of 1.618, which is currently the bench¬ 
mark among first-order methods for solving multi-block linear and quadratic SDPs though its 
convergence is not guaranteed. 

The remaining parts of this paper are organized as follows. In Sec. 2, we present some 
preliminary results from convex analysis. In Sec. 3, we propose an inexact two-block majorized 
sPADMM (imsPADMM), which lays the foundation for later algorithmic developments. In Sec. 
4, we give a quick review of the inexact sGS technique developed in [21] and propose our 
sGS-imsPADMM algorithm for the multi-block composite optimization problem (1.4), which 
constitutes as the main result of this paper. Moreover, we establish the relationship between 
the sGS-imsPADMM and the imsPADMM to substantially simplify the convergence analysis. 
In Sec. 5, we establish the global convergence of the imsPADMM. Hence, the convergence of 
the sGS-imsPADMM is also derived. In Sec. 6, we study the non-ergodic iteration complexity 
of the proposed algorithm. In Sec. 7, we present our numerical results, as well as some efficient 
computational techniques designed in our implementation. We conclude the paper in Sec. 8. 


2 Preliminaries 

Let lA and V be two arbitrary finite dimensional real Euclidean spaces each endowed with an 
inner product (•, •) and its induced norm || • ||. For any linear map O : 17 —>■ V, we use O* to denote 
its adjoint and ||0|| to denote its induced norm. For a self-adjoint positive semidefinite linear 
operator % ■. U ^ U, there exists a unique self-adjoint positive semidefinite linear operator, 
denoted as , such that = %. For any u,v GU, define {u,v)'h '■= {u,'Hv) and ||w||« := 

= ||'H 2 u||. Moreover, for any set U CU, we define dist(u, U) := vaiu'eu ||u — u'W and 
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denote the relative interior of U by t\(U). For any u, u', v, v' S we have 

{u,v)h = 5 (||m|!« + Ik||« - Wu-vW^) = i (||M + n||^ - llwllli - llnllli) . (2.1) 

Let 9 : U ^ {—oo, +oo] be an arbitrary closed proper convex function. We use doin0 to denote 

its effective domain and 89 to denote its subdifferential mapping. The proximal mapping of 9 
associated with H >- 0 is defined by Prox^(u) := argmin^g^^ {^(^) + ilk ~ G It 

holds [16] that 

||Prox^(t;) - Prox^(n')||^ < (n - u', Prox^(t;) - Prox^(z;')).^. (2.2) 

We say that the Slater constraint qualification (CQ) holds for problem (1.4) if it holds that 

{ (x, y) I a; € ri(domp), y G ri(dom( 7 ), A*x + B*y = c} ^ il). 

When the Slater CQ holds, we know from [24, Corollaries 28.2.2 & 28.3.1] that (x, y) is a solution 
to (1.4) if and only if there is a Lagrangian multiplier z £ Z such that (x, y, z) is a solution to 
the following Karush-Kuhn-Tucker (KKT) system 

0 G (?p(x) + V/(x) + ^ 0 , 0 G (?(?(y) + Vy(y) + A*x + B*y = c. (2.3) 

If (x,y, z) G A" X 3^ X Z satisfies (2.3), from [24, Corollary 30.5.1] we know that (x,y) is an 

optimal solution to problem (1.4) and z is an optimal solution to the dual of this problem. To 
simplify the notation, we denote w := {x,y,z) and W := X x y x Z. The solution set of the 
KKT system (2.3) for problem (1.4) is denoted by W. 


3 An Inexact Majorized sPADMM 

Since the two convex functions / and g in problem (1.4) are assumed to be continuously dif¬ 
ferentiable with Lipschitz continuous gradients, there exist two self-adjoint positive semidefinite 
linear operators Sf : X ^ X and Sg : y ^ y such that for x, x' G T and y, y' G 3^, 

fix) < 7(x;x') := fix’) + (V/(x'),x - x') -f Illx-x'll|^, 

giy) <diy,y') ■= giy') + {"^9iy'),y - y') + kWv - 

Let tJ > 0. The majorized augmented Lagrangian function of problem (1.4) is defined by, for 
any (x', y') G X x y and (x, y, z) G A x 3^ x Z, 

^aix, y; (z, x', y')) := p(x) -b fix; x') -b y(y) -b giy; y') 

+ {z, A*X B*y — c) -\- I ||A*x -b B*y — c]]^. 

Let S ■. X ^ X and T : 3^ —3^ be two self-adjoint positive semidefinite linear operators such 
that 

M := Sf+S+ aAA* y 0 and ff := Sg + T + aBB* >-0. (3.2) 

Suppose that {w^ := (x^,y^,z^)} is a sequence in A x 3^ x Z. For convenience, we define the 
two functions ipk ■ X ^ (—oo, oo] and (fk '■ y ^ (~oo, oo] by 

tpkix) :=p(x) -b l(x, Mx) - {l^, x), ifkiy) ■■= giy) + \iy, ffy) - {ly, y), 

where 

-l^ ■- V/(xk -b Az^ - Mx^ -b crA(A*x'= -b B*y^ - c), 

-Ik ■- Vy(y'=) + Bz^ - Ny^ -b ct6(A*x'=+i -b B*y’^ - c). 

Now, we are ready to present our inexact majorized sPADMM for solving problem (1.4) and 
some relevant results. 
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Algorithm imsPADMM: An inexact majorized semi-proximal ADMM for solving problem 
(1.4). 

Let T S (0, (l-|--\/5)/2) be the step-length and {efc}fc>o be a summable sequence of nonnegative 
numbers. Choose the linear operators S and T such that M > Q and A/" >- 0 in (3.2). Let 
:= (a;°, j/°,z°) G domp x domg x Z be the initial point. For k = 0,1,..., perform the 
following steps: 

Step 1. Compute and G dtpk{x'^^^) such that ||At“5d^|| < Sk and 

^k+i _ ^fc+i . ^ argmin^g;i, + ^\\x - x'^\\l} 

= aigmin^^xi’kix). 

Step 2. Compute and dy G 9(/3fc(y^+^) such that ||A/’“5d^|| < £/, and 

yk+i _ yk+i argmin^^gy {Ca{x'^+\y;w'^) + l\\y-y'"\\r} .3 

= argmiuj^gy {(pfc(y) -f {aBA*{x'^+^ - x^+^), y)}. 

Step 3. Compute z'^+^ := -h Ta{A*x'^+^ + B*y'^+^ - c). 

Proposition 3.1 Let {w*} be the sequence generated by the imsPADMM, and {x£}, {y^} be 
the sequence defined by (3.3) and (3.4). Then, for any k > 0, we have — x^'^^Wm < £fe 

and \\y^'^^ — y^'^^Wjy < {1 + a\\M~^BA* M.~^\\)ek, where M and N are defined in (3.2). 

Proof Noting that 0 G i9p(a;^+^) + — d^ and At 0, we can write = 

Prox^ (At“^(^^ + d^)). Also, we have = Prox^ . By using (2.2) we can get 

ll^fc+i _^fc+i|j^ < (a;*^+id^) = (At5(x'=+i-x'^+i), At-5d^). 

Thus, by using the Cauchy-Schwarz inequality, we can readily obtain that < 

11 At “ 2 d^ 11 < Efc. Similarly, we can obtain that 

||yfe+l _ < (yfc+1 _ dk _|_ crBA*{x'^+^ - X^~^^)) 

= (A/'3(p'=+i - p''+^), M~^dl) 

+(t(A/'5(p'‘+^ - y^+^),U~^BA*{x^^^ - x^+^)). 

This, together with ||A/’“5dy|| < Sk and \\x^^^ — < £fci implies that 

II /+1 - < W~"^dl\\ + (T|lAA-5SA*At-i 1111^'=+! - x^+^\\m 

<ek + cr||A/'“5,BA*At“5||efe, 

and this completes the proof. □ 


4 An imsPADMM with Symmetric Gauss-Seidel Iteration 


We first present some results on the one cycle inexact symmetric Gauss-Seidel (sGS) iteration 
technique introduced in [21]. Let s > 2 be a given integer and U := Ui x ■■■ x Us with each Ui 
being a finite dimensional real Euclidean space. For any u G U we write u = (ui ,... ,Us). Let 
TL : U ^ U he a given self-adjoint positive semidefinite linear operator with the following block 
decomposition: 



/TLii 'Hi2 • 

• 'His\ 




Uu := 

■17* 'U 
^12 ^22 ■ 

■ H2s 


U2 

, Uu ■■ = 


1 '17* 

\riis n.2s • 

■ Uss) 


\Us) 



nis \ 

dt[s—l)s 

0 / 


(4.1) 
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where 'Hu are self-adjoint positive definite linear operators, 'Hij '■ Uj ^ Ui, 1 < i < j < s are 
linear maps. We also define 'Hj.u := (’HuMi, ... ,'HssUs)- Note that 'H = 'Hd + 'Hu + and 'Hd 
is positive definite. To simplify later discussions, for any u GU, we denote u<i := {ui,... ,Ui}, 
u>i := i = l,...,s. We also define the self-adjoint positive semidefinite linear 

operator sGS{'H) :U^Uhy 

sGS{H) := (4.2) 

Let 9 : Ui ^ (— 00 , 00 ] be a given closed proper convex function and b G U he a given vector. 
Consider the quadratic function h :U ^ (— 00 , 00 ) defined by h{u) := | {u, 'Hu) — {b, u), Vit G U. 
Let Si,6i G Ui, i = 1,..., s be given error tolerance vectors with i5i = (5i. Define 

d(b,8)-=8HUuU2^{b-l>). (4.3) 

Suppose that u~ GU is a given vector. We want to compute 

u+ := argmin{6l(Mi) -h/i(u) -f - m”||sGS(W) “ «)}. (4.4) 

We have the following result, established by Li, Sun and Toh in [21] to generalize and refor¬ 
mulate their Schur complement based decomposition method in [20] to the inexact setting, for 
providing an equivalent implementable procedure for computing . This result is essential for 
our subsequent algorithmic developments. 

Proposition 4.1 (sGS decomposition) Assume that 'HuH = l,...,s are positive definite. 
Then 

n:='H + sGS(H) = {'Hd + 'Hu)'Hf\'Hd + K) ^ 0. 

Furthermore, for i = s, s — . ,2, define Ui by 

:= argmin„. {6 »(mC)-I-/ i(u<-_;^,Mi,M>i+i) - (5j,Uj)}. (4.5) 

Then the optimal solution u'^ defined by (4.4) can be obtained exactly via 

{ut := argmin„^ {6»(ui)-h/i(ui,u> 2 ) - (<Ji,Mi)}, 

:= argmin„. {^(Mj^)-I-/i(u|j_;^,ni,?I>i+i) - (5*,^*)}, i = 2,...,s. 

Moreover, the vector d{5,5) defined in (4.3) satisfies 

\\n--^d(6,S)\\ < WH-Hs-m + Wnli'Hd + 'Hu)-^^. (4.7) 

We should note that the above sGS decomposition theorem is valid only when the (possibly 
nonsmooth) function 9{-) is dependent solely on the first block variable ui, and it is not applicable 
if there is an additional nonsmooth convex function involving another block of variable. In the 
above proposition, one should interpret Ui in (4.5) and uf in (4.6) as approximate solutions to the 
minimization problems without the terms involving 6i and (5i.jDnce these approximate solutions 
have been computed, they would generate the error vectors <5^ and Si. With these known error 
vectors, we know that Ui and uf are actually the exact solutions to the minimization problems 
in (4.5) and (4.6). It is important for us to emphasize that when solving the subproblems in 
the forward GS sweep in (4.6) for i = 2,..., s, we may try to estimate uf by using Ui, and in 
this case the corresponding error vector 5i would be given by Si = Si + jyjJi'Hjfiu'^ — uj). In 
order to avoid solving the i-th problem in (4.6), one may accept such an approximate solution 
uf = Ui if the corresponding error vector satisfies an admissible condition such as ||(5i|| < c||5i|| 
for some constant c > 1, say c = 10. 

We now show how to apply the sGS iteration technique in Proposition 4.1 to the imsPADMM 
proposed in Sec. 3. We should note that in the imsPADMM, the main issue is how to choose 
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S and T, and how to compute and For later discussions, we use the following 

decompositions 


({Sf)ll 

{Sj)l2 ■ 



({Sy)ll 

{Sg)i2 ■ 

■ (4)i"\ 

{SfYu 

{Ef)22 ■ 

■ iEf)2m 

and 

{SMi2 

(4)22 • 

■ iA!g)2n 

\{Sf)*lra 

■ 



VSgYln 

(4)l • 

{Eg^UnJ 


for Sf and Eg, respectively, which are consistent with the decompositions of X and y. 

_ First, We choose two self-adjoint positive semidefinite linear operators Si ■. Xi ^ Xi and 
71 : 3^1 —>■ W for the purpose making the minimization subproblems involving pi and qi easier 
to solve. We need and Ti to satisfy the conditions that Mu := Si + (^/)ii -hcrSliXi 0 as 
well as Mil '■=Ti + (Eg)ii -|- crBiSi >- 0. With appropriately chosen and Ti, we can assume 
that the well-defined optimization problems 

min^,{p(xi)-b lllxiand mmy,{q{yi) + ^\\yi - 

can be solved to arbitrary accuracy for any given x'l G Xi and yj S 

Next, for i = 2,..., TO, we choose a linear operator Si > Q such that Mu := Si -I- {Ef)ii + 
aAiA* >~ 0 and similarly, for j = 2,..., n, we choose a linear operator Tj h 0 such that 
Mjj ■= fj + {Eg)jj + crBjB* y 0. 

Now, we define the linear operators 


M:=Ef+ aAA* + Diag(5i, .. .,Sm), M := Eg + cjBB* X Diag(ri, ■■■Mn). (4.8) 


Moreover, define Mu and Mu analogously as 'Ku in (4.1) for M and M , respectively, and 


Md := Diag(A^ii,...,A4„„), Md := Diag(A/'ii,..., Wn^). 


Then, M := Md + Mu+ Mu and M := Md +Mu XM*- Moreover, we define the following linear 
operators: 

S ■- Diag(5i,..., 5^) -b sGS(7W), M:=Sf + aAA* + S, 
f :=Diag(fi,...,f„)+sGS(A(') and M := Eg + aBB* + f, 

where sGS(Al) and sGS(A/') are defined as in (4.2). Define the two constants 

K := 2Vto- l||3Wd "II + ^/m||MM■Md + MuM^jj, 

k! ■■= 2y/TlMM\\My^ '^W + yjSi\\Md {Md + Mu) ^||. 


Based on the above discussions, we are ready to present the sGS-imsPADMM algorithm for 
solving problem (1.4). 
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Algorithm sGS-imsPADMM: An inexact sGS based majorized semi-proximal ADMM for 
solving problem (1.4). 

Let T G (0, (l-|--\/5)/2) be the step-length and {efc}fe>o be a summable sequence of nonnegative 
numbers. Let (x°, y^, z°) G domp x dom q x Z he the initial point. For fc = 0,1,..., perform 
the following steps: 


Step la. (Backward GS sweep) Compute for i = m,..., 2, 

« argmin^^g;t.^{£„(a:|i_i,x„x>+:J:i,y'=;u;'=) -f - xf |||}, 
e + Si{x-+^ - x^) with ||5f|| < Sk- 

Step lb. (Forward GS sweep) Compute for * = 1, ..., m, 

x,^+^ « argmin^_g;t.^{£„(x|j)|:i,x*,x>,+i,i/'=;w'=) + \\\x, - x^\\^}, 

G a 2 ;,£^(x<|l;^,x,^+\x>i+i,y'=;'u;'")+5i(x*+^ - xf) with ||(5f|| < £fc. 
Step 2a. (Backward GS sweep) Compute for j = u,..., 2, 

« argmin^,gy^{4(x'=+i,p^^._i,p„p|+ii;u;'=) + i||p, - y^\\l}, 

S dv,^a{x’^^\y<j_i,yj^^,y'^jli\w^) + fj{y^+^ - yf) with \\j^\\ < Sk- 
Step 2b. (Forward GS sweep) Compute for j = 1, ... ,n, 

« argminj^_gy^{£^(x'=+\y<+li,yj,y|+)J_i;u;'=) + ^\\yj - 
e dy^-£„(x'=+\y|+li,y^^+\y|+)J_i;w'=) +-^(y^^+^ - yf) with Ijqj^H < Sk- 
Step 3. Compute := + Ta{A*x'^+^ + B*y^+^ - c). 


For any A: > 0, and 5^ = 7^ = and 7 ^= = 

( 7 ^,... , 7 ^) such that 5^'^^ := and 7 ^^^ := 7 i^^, we define 

dl ■.= and d}; ■.=- l^)■ (4.10) 

Then, for the sCS-imsPADMM we have the following result. 

Proposition 4.2 Suppose that Aid >- 0 and JVd >- 0 for At and Af defined in (4.8). Let k and 
k' be defined as in (4.9). Then, the sequences {w^ := (x^,y^,z^)}, {5^}, {<5^}, { 7 ^} and { 7 ^} 
generated by the sGS-imsPADMM are well-defined and it holds that 

M= M + sGS{M)>~ 0, AA = A' + sCS(A') ^ 0. (4.11) 

Moreover, for any k>0,d^ and dy defined by (4.10) satisfy 


dl G d^{C>f{x^+\y>^) + illx'^+i - x'=|||), 
dl G dy(£j(x'=+\y'=+i) + \\\y^+^-y^\\\): 


(4.12) 


||At < KEfc, \\Af (4.13) 

Proof By Proposition 4.1 we can readily get (4.11). Furthermore, by Proposition 4.1 we can also 
obtain (4.12) from (4.10). By using (4.7) we can get 


\\M-^^dl\\ < \\Aid^^\\\\5’^-6^ + \\AiUAid + Aiu)-V\\ 

< {2y/m- l||At))'"|| + Vw|lA4|(Atd + At„)"^||)?fe. 


From here and (4.9), the required inequality for ||At ^d^\\ in (4.13) follows. We can prove the 
second inequality in (4.13) similarly. □ 
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Remark 4-1 (a) If in the imsPADMM, we choose S .= S,T := T, then we have M = M. > 0 and 
N = N >- Q. Moreover, we can define the sequence {sk} by Ek '■= inax{«;, K'}ek V/c > 0, so that 
the sequence {efc} is summable if {eu} is summable. Note that the sequence {w'^} generated 
by the sGS-imsPADMM always satisfies ||A^“2d^|| < and ||A/’“5(iy|| < £fe. Thus, 
can be viewed as a sequence generated by the imsPADMM with specially constructed semi- 
proximal terms. To sum up, the sGS-imsPADMM is an explicitly implementable method to 
handle high-dimensional convex composite conic optimization problems, while the imsPADMM 
has a compact formulation which can facilitate the convergence analysis of the sGS-imsPADMM. 

(b) As was discussed in the paragraph ensuing Proposition 4.1, when implementing the sGS- 
imsPADMM algorithm, we can use the computed in the backward GS sweep (Step la) to 
estimate in the forward sweep (Step lb) for z = 2,..., m. In this case, the corresponding 
error vector is given by 6^ = 5^ + ~ accept the approximate 

solution x\'^^ = x^'^^ without solving an additional subproblem if 115(^11 < e^- A similar strategy 
also applies to the subproblems in Step 2b for j = 2,..., n. 


5 Convergence Analysis 


First, we prepare some definitions and notations that will be used throughout this and the 
next sections. Since {e^} is nonnegative and summable, we can define £ := and £' := 

^k- Let {w’^ := {x^, y^, z^)} be the sequence generated by the imsPADMM and {(a;*^, y^)} 
be defined by (3.3) and (3.4). We define the mapping TZ ■. Xxy ^ Zhy Tl{x, y) := A*x-\-B*y—c, 
y{x,y) G A X 3^, and the following variables, for fc > 0, 

Aj: := x^ - := y^ - y’^+^, := 7^(x^ /), f'' := n{x’^, f), 

Al := - x'=, Al := y'^ - ?/'=, A^ := z'^ - z'=, 

~k+l ^k ^j.k+1^ ^k+l ^k T(jfk+l^ 


with the convention that x° = x° and if = ?/°. For any k> 1, by Glarke’s Mean Value Theorem 
[3, Proposition 2.6.5] there are two self-adjoint linear operators Q < Ef and Q < Eg 
such that 

V/(x'=-i) - V/(x'=) = VlAl-\ Vg{y^-^) - Vg{y’^) = (5.1) 

We define three constants a, a, j3 hy a := (1 -I- r/minjl -|- r, 1 -|- r“^})/2, 

a := 1 — a min{r, r“^}, /3 := min{l, 1 — t-|-T“^} a — (1 — Q!)t. (5.2) 

Since r G (0, (1 -I- •\/5)/2), it holds that 0<Q!<1, 0<a<l and /3 > 0. Now, we define for any 
w GW and fc > 0, 

Mw)--iy\\z-z'^f + \\x-x’^fg^_^^ + \\y-y'‘f~^^ 

+ a||7^(x, /)f + aaWr'^f + 

Mw) ■■= yL||z - + ||x - + \\y - 

Moreover, since / and g are continuously differentiable, there exist two self-adjoint positive 
semidefinite linear operators Zf<Ef and Sg < Eg such that 

f f{x) > f{x') + {Vf{x'),x-x') + b||x-x'|||.^,, Vx,x' G A, 

\ g{y) > 9{y') + {Eg{y'),y-f) + ^Wy-fWs,^ Vy,!/' G V- 

Additionally, we define the following two linear operators 

E:=^Ef+S+^E^jj[*^ g :=lEg + T + mm{T,l + T-T'^}aaBB*. (5.5) 

The following lemma will be used later. 
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Lemma 5.1 Let {afe}fe>o be a nonnegative sequence satisfying for all k > 0, 

where {efc}fc>o is a nonnegative and summable sequence of real numbers. Then the quasi-Fejer 
monotone sequence {a^} converges to a unique limit point. 

Now we start to analyze the convergence of the imsPADMM. 

Lemma 5.2 Let {ic^} be the sequence generated by the imsPADMM for solving problem (1.4). 
For any k > 1, we have 


(1 - T)a||r'=+if + (T|l7^(x'=+^?/'=)|p + - df, A 


> acr(||r^+^|p — 

'' V ''a(S,+T) 


) + /3c 


„fc +1112 






' y'' (y.{Eg-\-T)-\-m.\n{T,l-\-T — r'^}oLGBB* 


(5.6) 


Proof First note that + [1 — T)ar^. By using the fact that .y^) = 

r^+4sMS,wehave 

(1 - T)a\\r^+^f + a\\'R[x^+\y^)\\'^ 

= (2 - r)a||r'=+if + a\\B* A^f + 2{ar^+^ ,B* A^) (5.7) 

= (2-r)a||r'=+if +a||Z\^|||g. +2(l-r)CT(r^S*Z\J) +2(z'=+i 


Since Sg hPy h 0, by using (2.1) with V. := Eg + T —Vy P 0 and u = Ay v = Ay, we have 


-2(z\^-i, -liA^lli -I!z\ 


-y/E,+r-Vl - 




fe-l||2 

y "Sg+T-v^ 


From Step 2 of the imsPADMM we know that for any fc > 0, 

dy - Vg{y^) - BT'^+^ + {Eg + r)Z\^ e dq{y^+^). 

By using (5.9) twice and the maximal monotonicity of dq, we have for A: > 1, 

(d^ - 4-1 + Vg{y^-^) - Eg{y>^) - B{z'^+^ - z'^), -Zl^) 

+ (( 4 +r)(Zl^-Afc-i), _zifc)>o, 

which, together with (5.1) and (5.8), implies that 


^fe + 1 


-z^ B*A’^)-{d>^-d^y-\ AD 


> wahl 

- '' y''E„+r 




On the other hand, by using the Cauchy-Schwarz inequality we have 

> 2{r\B*Al) >-\\B*At^ - Ik'' 


(5.8) 


(5.9) 


(5.10) 


(5.11) 


Now, by applying (5.10) and (5.11) in (5.7), we can get 

(1 - r)a||r^-+if + a||7^(x'=+^ y^)f + 2(d^-i - d^ A^ 
> max{l - T, 1 - r-iM||r'=+if - ||r''f) + 

+ min{r, 1 + r - r^^iW A^f + T-^\\r’^+Y)- 


By using the Cauchy-Schwarz inequality we know that 


mx'^+\yDr = Ik'' - A*Air = ik'^ik + M*^^ik - ^tda^ad 
> ik'^ik + - 2ik''ik - iMM^ik = D\A*Air - ik'^ik, 


(5.12) 
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so that for any a S (0,1], we have 

(1 - a)((l - r)cr||r'=+i|p + a\\TZ{x'^+'^, y'^)\\'^) 

> (1 - a)((l - T)a\\r>^+T - <^\\rT + f ) (5.13) 

= (a - l)rcr||r'=+if + (1 - a)cr(||r'=+if - 

Finally by adding (5.13) to the inequality generated by multiplying a to both sides of (5.12), 
we can get (5.6). This completes the proof. □ 


Next, we shall derive an inequality which is essential for establishing both the global convergence 
and the iteration complexity of the imsPADMM. 


Proposition 5.1 Suppose that the solution set W to the KKT system (2.3) of problem (1.4) is 
nonempty. Let be the sequence generated by the imsPADMM. Then, for any w := (x, y, z) G 
W and k > 1, 


2a(d\ - d\-\ A%) - 2(dl, -X)- 2i,d\, y^^^ - y) 

+ Pxll^ + W^yWg + /3cr||r'=+if < Mw) - f^k+iiw). 


Proof For any given (x, y, z) G W, we define Xe '■= x — x, pe '■= y — y and Ze ■= z — z. Note that 

z’^ + aTZix^+\y’^) = z^+^ + aB* [y^ - y^^^). 


Then, from Step 1 of the imsPADMM, we know that 

dl - Vf(x^) - A(?'=+i + aB*A\) + {Mf + S)Al G dp{x^^^). 
Now the convexity of p implies that 

p{x) + {dl - \7f{x>^) - A(z'=+i + aB*A'^) + {Sj + S)A^, x^+^) > p(x'=+i). 

On the other hand, by using (3.1) and (5.4), we have 

f{x<^) - /(x'^+i) - (V/(a;'=), A^) > 

Thus, summing up the above two inequalities together gives 

fix) - fix^+^) + {Vfix^), > \\\xt\\% - 

By summing (5.16) and (5.17) together, we get 

pix) + fix) - pix’^+^) - fix’^+^) - (?'=+! + aB*A’;, A*x’f+^) 

+ iiSf+S)Al, + idl, > liWxlWl^ - ||A^||A ). 

Applying a similar derivation, we also get that for any y G y, 

qiy) + giy) - g(/+^) - giy^~^^) - (z'^+S B*y'f+^) 

+ iiSg+T)A>^,y'!+^) + id'^,y^,+^) > U\\ye\\l,-\\^y\\%)- 

By using (2.3), (5.4) and the convexity of the functions /, g, p and q, we have 

pix^+^) + fix^+^) - pix) - fix) + iAz, > \ \\x'l+'^\\%^, 

qiy'^^^) + giy’"^^) - qiy) - giy) + {Bz, y^+^) > \\\ye^^\\%^. 


(5.15) 


(5.16) 


(5.17) 


(5.18) 


(5.19) 


(5.20) 
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Finally, by summing (5.18), (5.19), (5.20) together, we get 


x^+i) + (d^, y^+i) - (z^+i, r'^+i) - aiJ3*A\, A*x>l+^) 
+ {Al + (Z\^, + 11 ^'"' 


nfc ||2 


lla^elli:^ + 


> 5 < 


> ilASlIl:, 


e^IlL 




i^g+r 
k+l\\2 


v»s„ 


i:, + llye+'lll:g) 


\\n{x'^+\y^)r -\\B*yir -\\r^+^r) 


Next, we estimate the left-hand side of (5.21). By using (2.1), we have 

{B*Al, A*x’l+^) = {B*yl - B*y^^+\ - B*y^+^) 

= {B*yl-B*yl-^\ r'^+i) - \{\\B*ylf - \\B*yl - B^y^+^f - Wyl+^f) 

= \{\\B*yl+^r- 

Also, from ( 2 . 1 ) we know that 

(2/e+\^y)£„+r = iU!2/eir? _, t -- llde 
Moreover, by using the definition of {z^} and (2.1) we know that 


/ Sf+s - lla;. 


k+l\\2 \ _ 1 ||/\fe||2 

Sf+S> 

,/c+l||2 i — ill Ak\\2 

S,+T> 


) - -IIA* 

-) 2 ll^yni;+r' 




2t<j ^ 


- |U'"+^ - Z^P - 


zl A crr'=+i) = - z^ z^) A cr||r'' 

fe||2^ _L „\\^k+l\\2 


fc +1 II 2 


2 )-kcr||r'' „ 

Thus, by using (5.22), (5.23) and (5.24) in (5.21), we obtain that 

(d^, x^i) + (dj, y'l+p A ^iWz’^r - ) + %{\\B*y^A^ - Wy'^+^f) 




Id, 


fe+l II 2 




> ill^: 


fc ||2 


+ ^I 1 A 


fc ||2 

y'ASn+T 


+ f ||7^(x'=+^y'=)f + . 


(5.21) 


(5.22) 


(5.23) 


(5.24) 


(5.25) 


Note that for any y G y, TZ{x,y) = B*ye- Therefore, by applying (5.6) to the right hand side 
of (5.25) and using (5.3) together with (5.5), we know that (5.14) holds for fc > 1 and this 
completes the proof. □ 


Now, we are ready to present the convergence theorem of the imsPADMM. 

Theorem 5.1 Suppose that the solution set W to the KKT system (2.3) of problem (1.4) is 
nonempty and {rc^} is generated by the imsPADMM. Assume that 

Sf 4-5-1- aAA* A 0 and Dg aT A uBB* A 0. (5.26) 

Then, the linear operators T and Q defined in (5.5) are positive definite. Moreover, the sequence 
{w^} converges to a point in W. 

Proof Denote p := min(T, 1 4- r — r^) G (0,1]. Since a > 0, from the definitions of P and Q, we 
know that 

P = (27/ -4 5 4- cjAA*) a %Sf A A 0, 

g = P^{SgAT A aBB*) A ^Sg A A ^aBB* A 0. 

Now we start to prove the convergence of the sequence {rc^}. We first show that this sequence is 
bounded. Let w := {x, y, z) be an arbitrary vector in W. For any given (x, y, z) G W, we define 
Xe ■= X — X, ye '■= y — y and Ze ■= z — z. Since ^ 0, it holds that 

\\Ay% A 2a{d% - d%-\ A^) = ||A^ + aQ-pd^ - d^-i)||| - a^ljd^ - d>;-%.^. (5.27) 








14 


Liang Chen et al. 


By substituting and for and in (5.14), we obtain that 


(j)k{w) - (j)k+i{w) + a^Wdl 
> — x^Wjr + /3cr|if^+^||^ + — y^ + aQ~^d, 


-l^fe-l||2^ 


Define the sequences and inZxdfxJ^xZxJ^for/c^lby 

f := (v^4':(^/+‘5)L*,A/'^^,v^r'=,v^(rg + r)5(Z\^"^)), 

1 P := {Ef + 5)^x1,M^y’^,V^f'^,y/a{Sg+T)^{y'^~^ - y^)). 


(5.28) 


(5.29) 


Obviously we have ||Cfe|r = (j^kiw) and ||^fc|r = 4’kiw), which, together with (5.28) implies that 
||^fc+i ||2 £ + a'^\\G~^dy~^\\'^. As a result, it holds that < ||C^|| + a\\Q~^dy~^\\. 

Therefore, we have that 


< lie'll!++ Ilf- f+i. 

Next, we estimate ||f — f ^^|| in (5.30). Since 3 + r S [1, 2], it holds that 

;^||A^+^|P + 3 cr|jr^“''^ — r^+^lp = (t + a)(T|]f^+^ — r^+^|p 
< 2ct||A*A^+i +S*Afc+i||2 < 4||A^+i||2^^. +4||A^+i||2gg., 

which, together with Proposition 3.1, implies that 

Ilf+'-f+'f 


(5.30) 


< Ux'^"\\%+s + 'I 


2 

^q+r 




■4pfc+i||2 


BB* 


(5.31) 


<5i\\At+TM + \\4^T^)<d^4, 

where p is a constant defined by 

Q := 5(1 + (1 + tT||A/'“5S^*Al“511)^). 


(5.32) 


On the other hand, from Proposition 3.1, we know ||^ 2 d^|| < \\Q ||ej,. By using this fact 

together with (5.30) and (5.31), we have 


llf+'ll < Ilf II + QSk + \\g-^^j^^\sk-i < Ilf II + (p+ if-^AffDf 


(5.33) 


which implies that the sequence {f} is bounded. Then, by (5.31) we know that the sequence 
{f} is also bounded. From the definition of f we know that the sequences {y^}, {r^} and 

{{Ef + 5 ) 23 ;:^} are bounded. Thus, by the definition of r^, we know that the sequence {Ax’^} is 
also bounded, which together with the definition of A4 and the fact that Al 0, implies that 
{a;^} is bounded. 

By (5.28), (5.31) and (5.33) we have that 

EZi (If- A\^ + + Ilf- f + aG-^dlA\l) 

< YALi {4)k{w) - (j)k+i{w) + (t>k+i{w) - 0fc+i(w) + a^lMf ^lle-i) 

< ^i(^) + Er=illf- f+'ll(llf+'ll + Ilf+'ll) + Iff £' (5.34) 

< (jiiA) + \\g~^N^\\^£' + ymax{||f+^|| + ||f+^||}3 < 00 , 

A :>1 


where we have used the fact that 4>k{w) — <l>k{w) < ||f — f IKIlf II + Ilf ID- Thus, by (5.34) 
we know that {||x^+^ — a;^||3r} —>■ 0, {lly^"*"^ — y^ + ^||g} —>■ 0, and } —t 0 

as /c —>■ 00 . Since we have A 0 and t/ 0 by (5.26), {5;^+^ — x^} 0, {y^'*'^ — y^} —>■ 0 

and {f} —?> 0 as fc —>■ 00 . Also, since Al >- 0 and Af 0, by Proposition 3.1 we know that 
{x^ — x^} 0 and {y^ — y^} —>■ 0 as A: —>■ 00 . As a result, it holds that {A^} —>• 0, {Ay} —>• 0, 

and {r^} —>• 0 as fc —>• 00 . Note that the sequence {(a;^+^, y^+^, z^+^)} is bounded. Thus, it has 
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a convergent subsequence which converges to a point, say {x°°, y°°, z°°). 

We define two nonlinear mappings F : W —>■ X and G : W —>■ Z by 

F{w) := dp{x) + yfix) + Az, G{w) := dqiy) + Vg(?/) + Bz,\/w G W. (5.35) 

From (5.15), (5.9) and (5.1) we know that in the imsPADMM, for any /c > 1, it holds that 

(4- + [Ef + S)Al + (r - l)aAr^+^ - G F(n;'=+i), ^ 

[4- V^+^Al + (Eg + T)A^ + (t- l)aBA+^ G G(i(;*^+i). 

Thus by taking limits along {ki} as z —>■ oo in (5.36), we know that 

0€dpix°°)+Vfix°°)+Az°°, and 0 € dqiy°°)+Vgiy°°) + Bz°°, 

which together with the fact that limfe_>oo = 0 implies that {x°°, y°°, z°°) G W. Hence, 
(x°°,y°°) is a solution to the problem (1.4) and z°° is a solution to the dual of problem (1.4). 

By (5.33) and Lemma 5.1, we know that the sequence {||C^||} is convergent. We can let w = 
{x°°, y°°, z°°) in all the previous discussions. Hence, lim/j_,.oo 4 = 0- Thus, from the definition of 
we know that limfc_>oo = z°°, linifc_>oo 2/^ = 2/°° and limfc_).oo(^/ + S)x'^ = (Ef + S)x°°. 
Obviously, since limfc_j.oo = 0, it holds that {A*x’^} —)■ A*x°° as fc —)• oo. Finally, we get 
limfc_>.oo x^ = x°° by the definition of Ai and the fact that A1 0. This completes the proof. □ 


6 Non-Ergodic Iteration Complexity 

In this section we establish an iteration complexity result in the non-ergodic sense for the 
imsPADMM. The definitions and notations in Sec. 5 also apply to this section. We define the 
function D : W —>■ [0, oo) by 

D(w) := dist^(0, F(w)) + dist^(0, G{w)) + \\Tl(x, y)4, Vw = (x, y, z) G W, 

where F and G are defined in (5.35). We say that w G W is an e-approximation solution of 
problem (1.4) if D{w) < e. Our iteration complexity result is established based on the KKT 
optimality condition in the sense that we can find a point w GW such that D(w) < o(l/A:) after 
k steps. The following lemma will be needed later. 

Lemma 6.1 If {ui} is a nonnegative sequence satisfying then we have min {ai} < 

l<i<k 

a/k and lim \k ■ min {oi}} = 0. 

k-^-oo l<2<fe 

Now we establish a non-ergodic iteration complexity for the imsPADMM. 

Theorem 6.1 Suppose that all the assumptions of Theorem 1 hold and the convergent sequence 
{w^} generated by the imsPADMM converges to the limit w := {x,y,z)- Then, there exists a 
constant u; > 0 such that 

min < uj/k, and lim jfc min {D(r(;*“'"^)}} = 0. (6-1) 

l<i<k k—¥cc) l<2<fc 

Proof For any (x,y, z) G X x y x Z, define Xe ■= x — x, ye ■= y — y and Ze = z — z. Moreover, 
we define the sequence {Cfc}fc>i by 

a := + 2{dl,y4^) + a^dl - d^Yg-^)- 

We first show that {Cfc} is a bounded sequence. Let {4} be the sequence defined in (5.29). 
Define g := ||^^|| + (g + \\G~^.N'^\\')8, where g is defined by (5.32). From (5.29) and (5.33), we 
know that for any z > 1, 

+ + < IlG'f < kk 


( 6 . 2 ) 
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where a is defined in (5.2). We then obtain that 

<2a||r*+if+ 2||y*+i||2gg. < |(aa)||r*+if + 2|!y*+i||^. (6.3) 

By using (6.2) and (6.3) together, we get 

< ||4’^^|||^,_^_5 + i(Scr)||r*+i|p + 2||y*+i||^ < 2g^/a. (6.4) 

We can see from (6.2) that Hyg^^llA^ < Q- This, together with (6.4) and the fact that 0 < S < 1, 
implies 

+ ('^r2/e^^)l < {\/^+l)Qek- (6.5) 

Note that ||f/“5dy|| < \\Q~^.N'^\\£k and 0 < a < 1. Thus, we have 

(6.6) 

Therefore, by combining (6.5) and (6.6), we can get 

C/c < E“i (2|(4>4’^^) + {<^y,yi^^)\+a^\\dl - 
<c := 2 (v^+l)ef + 4||g-5Af5f£:', 

where £ and £' are defined in the beginning of Sec. 5. By using (5.14), (5.27) and (6.7), we have 
that 

Er=i + /3a|lr'=+if + |1Z\^ - ag-\d^^ - 4-1)11^ 

< Er=i(2|(4, 4+')l +2|(4, 4+')l +«"II4 - 4”'lle-0 (6-8) 

+ Efci(4(w) - 4)k+i{w)) < ())i(u}) + c, 

where C is defined in (6.7). Also, since 0 < a < 1, we have that 

PS + ag-^d'i - dl-^)\\l > \\A% - 2PSIIIMS - 

By (6.2) we know that — Q- Hence, P^H < 2||A/’“5p. From the fact that IjdSH < 

IIAfs ||efe, we can get ||(dS~4~^)ll — IPs || (efc+Cfc-i). Thus from (6.8) and the above discussions, 
we have that 

Er=i(psp+PSP) 

< max {1, \\MF-% ||Aft;-i||} EPi (PSP + + \\A%) 

< Wo := max{l, \\MJ^-^\\,\\Mg-^\\} {(j)i{iv) + ( + ^^f^\\\^f~^\g£) ■ 

Next, we estimate the value of From the fact that A4 ^ V!^ and (5.36), we obtain that 

dist^(0,F(P+i)) 

< 114 + (p + 5 - V^+^)Al + (r - l)aAr^+^ - 

< 114 + {M- V^+^)Al + (tA(P+i - r'=)||2 

< i\\M\\{\\M-Uir + PSP + a^lp-UlPllP+i - ). 

Also, from the fact that M >Vy and (5.36), we have 
dist^(0,G'(P+i)) 

< 114 + (Ag + r - V^+^)Al + {t- l)(jBr^+^f 

< 114 + {N-V^+^)Al + ctH(P+i - rp 

< 3IPII (|pP4f + PSP + 2a2||Af-pf IIP+I - ) 
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Define uji := ^‘^ (11-^ 2 ^|p + HA/” 2 ^p)and 

u }2 ■= max {IIAdII + 2cr^||;B^*Ad“5 1 | 2 ^ || A/’ll,dwi}. 

It is now easy to verify from the above discussions that 

EZi D{w'^+^) <u;:= 3(||Af |1 + ||AA||)£:' + 2a;i||ri|| + 3a;occ2. 

Therefore, by the above inequality and Lemma 6.1, we know that (6.1) holds with w > 0 being 
defined above. This completes the proof. □ 

Remark 6.1 (a) We note that the sequence {D{w’^)} is not necessarily monotonically decreasing, 
especially due to the inexact setting of the imsPADMM. Thus it is not surprising that the 
iteration complexity result is established with the “mini<i<fc” operation in Theorem 6.1. 

(b) For a majorized ADMM with coupled objective, the non-ergodic complexity analysis was 
first proposed by Cui et al. [4]. For the classic ADMM with separable objective functions, Davis 
and Yin [5] provided non-ergodic iteration complexity results in terms of the primal feasibility 
and the objective functions. One may refer to [4, Remark 4.3] for a discussion on this topic. 


7 Numerical Experiments 

In this section, we report the numerical performance of the sGS-imsPADMM for the following 
rather general convex QSDP (including SDP) problems 

min { i (A, QX) + (C, X) \ AeX = bs, AjX > 6/, A e 5)1 n A^} (7.1) 

where 5" is the cone of n x n symmetric positive semidefinite matrices in the space of n x n 
symmetric matrices 5", Q : 5" —)■ 5" is a self-adjoint positive semidefinite linear operator, 
Ab : 5" ^ SR"*® and A/ : 5” ^ 3?™^ are linear maps, C S 5", bs € 3?™® and bi G 3?""^ are 
given data, Af is a nonempty simple closed convex set, e.g., Af = {A G \ L < X < U} with 
L,U G being given matrices. The dual of problem (7.1) is given by 

- min S^{-Z) + \ {W, QW) - {bs, Ve) - (&/, Vi) 2 ) 

s.t. Z-QW + SA A^eVe + A*iyi = C, S G SI, yi >Q, W GW, 

where W is any subspace in 5" containing Range(Q). Typically W is chosen to be either 5” or 
Range(Q). Here we fix W = 5". In order to handle the equality and inequality constraints in 
(7.2) simultaneously, we add a slack variable v to get the following equivalent problem: 

max ( - 5lf{-Z) - {v)) - ^ {W, QW) - (5s- {S) A {bE, yE) + {bi, yi) 

s.t. Z-QW + S + A*EyE + A}yi = C, V{v -yi)=D, Wg W, 

where V G 3 ?'” 2 xm 7 jg ^ positive definite diagonal matrix introduced for the purpose of scaling 
the variables. The convex QSDP problem (7.1) is solved via its dual (7.3) and we use A G 5" 
and u G 3?™^ to denote the Lagrange multipliers corresponding to the two groups of equality 
constraints in (7.3), respectively. Note that if Q is vacuous, then (7.1) reduces to the following 
general linear SDP: 


min { {C, X) I AeX = bs, AiX >bi, A G 5^ n N}. (7.4) 

Its dual can be written as in (7.3) as follows: 

min {5lf{-Z) + (u)) -k (5s- (5') - {bE, yE) - {h, yi) ^ 

s.t. Z S -\- A*EyE + •^*iyi = C, 'D{v ~ yi) = 0, 
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or equivalently, 

min + 5^-r^i (yi)) + 6s:^ (S) - {bs, ys) - {bi, yi) 

s.t. Z + S + A%yE + A}yi = C. ^ ’ 

Denote the normal cone of A/” at X by Njy{X). The KKT system of problem (7.1) is given by 

r A*EyE + A*iyi + S + Z-QW-C = 0, AeX - = 0, 

\ 0 e Nm{X) + Z, QX - QW = Q, X asi, s e {X, S) = O, (7.7) 

[ AiX - &/ > 0, yi > 0, {Aix - b, yi) = 0. 

Based on the optimality condition (7.7), we measure the accuracy of a computed solution 
{X, Z,W, S,yE,yi) for QSDP (7.1) and its dual (7.2) via 


??qsdp = T^ax{r]D,Vx,Vz,VP,Vw,VS,Vi}, 


(7.8) 


where 


ID 


VP 


\\A%yE+A‘ryi+S+Z-QW-C\\ _ \\X^JI^jjX)\[ _ \\X-n^(X-Z)\\ 

l + lldl > - 1+IIXII ’ VZ - i + ||x|| + ||Z|| ’ 


\\AEX — bE II 

i+II&eII ’ 


Vw 


\\QX-QW\\ 

1+IICII 


, r]s = max { 


||Js:-77s™(X)|| |(X,S)| 1 

1+llxii : 1+||X||+||S||D 


VI 


= max I 


II min(0,1/7)11 

i+lly/ll 


II min(0,.4jX-bj)|| \{Ai X-bj ,yi)\ I 
l+l|b/ll ’ l + ll^/x—6/|| + ||yj-|| J 


In addition, we also measure the objective value and the duality gap: 

Objprinial := QX) + {C,X), 

Objduai := -S%-{-Z) - \ {W, QW) + (be, ys) + (6/, yi), 

. Ohjppinia.1 Ohjdual 

•“ l + |Objprinral| + |Objdual| * 


(7.9) 


The accuracy of a computed solution {X, Z, S, yE, yi) for the SDP (7.4) is measured by a relative 
residual rjs&p similar to the one defined in (7.8) but with Q vacuous. 

Before we report our numerical results, we first present some numerical techniques needed 
for the efficient implementations of our algorithm. 


7.1 On Solving Subproblems Involving Large Linear Systems of Equations 

In the course of applying ADMM-type methods to solve (7.3), we often have to solve a large 
linear system of equations. For example, for the subproblem corresponding to the block y/, the 
following subproblem with/without semi-proximal term has to be solved: 

min{ - (5/, yi) + f || [A/,-P]*?// - rf \ \\yi-yj\\r], (7-10) 

where T is a self-adjoint positive semidefinite linear operator on , and r and yj are given 
data. Note that solving (7.10) is equivalent to solving 

{a{AiA*j + V^) +r)yi = 7~bi + a{Ai,-V)r + TyJ. (7.11) 

It is generally very difficult to compute the solution of (7.11) exactly for large scale problems 
if T is the zero operator, i.e., not adding a proximal term. We now provide an approach for 
choosing the proximal term T, which is based on a technique^ proposed by Sun et al. [31], so 
that the problem can be efficiently solved exactly while ensuring that T is not too “large”. Let 

V ■.= AiA*j+V^. 

^ This technique is originally designed for choosing the preconditioner for the preconditioned conjugate gradi- 
ent method when solving large-scale linear systems, but it can be directly applied to choosing the semi-proximal 
terms when solving subproblems in ALM-type or ADMM-type methods. One may refer to [31, Section 4.1] for 
the details. 
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Suppose that V admits the eigenvalue decomposition V = ^ with Ai > ... > A„ > 0. 

We can choose T by using the first I largest eigenvalues and the corresponding eigenvectors of 
V. By following the procedure provided in [31] we have 

T=aj:ti+A^i+i-X.)V.V*. (7.12) 

Thus T is self-adjoint positive semidefinite. Moreover, it is more likely that such a T is “smaller” 
than the natural choice of setting it to be cr(AiX— V). Indeed we have observed in our numerical 
experiments that the latter choice always leads to more iterations compared to the choice in 
(7.12). 

To solve (7.11), we need to compute (ctV -I- T) which can be obtained analytically as 
(crV -I- T)~^ = (ctA;+i)“^X -I- ~ ■ Thus, we only need to calculate 

the first few largest eigenvalues and the corresponding eigenvectors of V and this can be done 
efficiently via variants of the Lanczos method. Finally, we add that when the problem (7.10) is 
allowed to be solved inexactly, we can set T = 0 in (7.10) and solve the linear system tjV = r by 
a preconditioned conjugate gradient (PCG) method. In this setting, (aV + T)~^ with T dehned 
in (7.12) can serve as an effective preconditioner. 

Note that we can apply similar techniques to solve large linear systems of equations arising 
from solving the subproblems corresponding to W in (7.3). 


7.2 Numerical Results for Quadratic/Linear SDP Problems 


In our numerical experiments, we construct QSDP test instances based on the doubly nonnega¬ 
tive SDP problems arising from relaxation of binary integer quadratic (BIQ) programming with 
a large number of inequality constraints that was introduced by Sun et al. [30] for getting tighter 
bounds. The problems that we actually solve have the following form: 


min A(X, QX) + ^{Q, X) + (c, x) 


s.t. diag(X) — a; = 0, X = 


X X 
x'^ 1 


G 5(1, X G A7 := {X G 5" : X > 0}, 


Xi — Xij > 0, Xj — Xij > 0, Xij — Xi — Xj > —1, Vl<i<j<n — 1. 


For convenience, we call them as QSDP-BIQ problems. When Q is vacuous, we call the corre¬ 
sponding linear SDP problems as SDP-BIQ problems. The test data for Q and c are taken from 
the Biq Mac Library maintained by Wiegele^. 

We tested one group of SDP-BIQ problems and three groups of QSDP-BIQ problems with 
each group consisting of 80 instances with n ranging from 151 to 501. We compare the perfor¬ 
mance of the sGS-imsPADMM to the directly extended multi-block sPADMM with the aggres¬ 
sive step-length of 1.618 on solving these SDP/QSDP-BIQ problems. We should mention that, 
although its convergence is not guaranteed, such a directly extended sPADMM is currently 
more or less the benchmark among first-order methods targeting to solve multi-block linear 
and quadratic SDPs to modest accuracy. Note that for QSDP/SDP problems, the majorization 
step is not necessary, so we shall henceforth call the sGS-imsPADMM as sGS-isPADMM. We 
have implemented our sGS-isPADMM and the directly extended sPADMM in Matlab. All the 
320 problems are tested on a HP Elitedesk with one Intel Gore i7-4770S Processor (4 Gores, 8 
Threads, 8M Gache, 3.1 to 3.9 GHz) and 8 GB RAM. We solve the QSDP (7.1) and the SDP 
(7.4) via their duals (7.3) and (7.5), respectively, where we set V := al with a = y^||AI/||/ 2. 
We adopt a similar strategy used in [30,20] to adjust the step-length t The sequence {ek}k>o 


^ http://biqmac.uni-klu.ac.at/biqmaclib.html 

® If T S cxd) but it holds that < oo, the imsPADMM (or the sGS-imsPADMM) 

is also convergent. By making minor modifications to the proofs in Sec. 5 and using the fact that IMyll < 

we can get this convergence result with ease. We omit the detailed proof to reduce the length of this paper and 
one may refer to [2, Theorem 1] for such a result and its detailed proof for the sPADMM setting. During our 
numerical implementation we always use r £ [1.618,1.95]. 






20 


Liang Chen et al. 


that we used in imspADMM is chosen such that e/j < The maximum iteration number 

is set at 200,000. 

We compare sGS-isPADMM applied to (7.3) with a 5-block directly extended sPADMM 
(called sPADMM5d) with step-length of 1.618 applied on (7.2), and compare sGS-isPADMM 
applied to (7.5) with a 4-block directly extended sPADMM (called sPADMM4d) provided by 
[30] with step-length"^ of 1.618 on (7.6). For the comparison between sPADMM4d and some other 
ADMM-type methods in solving linear SDP problems, one may refer to [30] for the details. For 
the sGS-isPADMM applied to (7.3), the subproblems corresponding to the blocks {Z,v) and S 
can be solved analytically by computing the projections onto N x 3?™^ and 5", respectively. For 
the subproblems corresponding to ue, we solve the linear system of equations involving the coef¬ 
ficient matrix Ae-A.% via its Gholesky factorization since this computation can be done without 
incurring excessive cost and memory. For the subproblems corresponding to yj and W, we need 
to solve very large scale linear systems of equations and they are solved via a preconditioned 
conjugate gradient (PGG) method with preconditioners that are described in the previous sub¬ 
section. In the implementation of the sGS-isPADMM, we have used the strategy described in 
Remark 4.1 (b) to decide whether the quadratic subproblems in each of the forward GS sweeps 
should be solved. In our numerical experiments, we have found that very often, the quadratic 
subproblems in the forward sweep actually need not be solved as the solutions computed in 
the prior backward sweep already are good approximate solutions to those subproblems. Such a 
strategy, which is the consequence of the flexibility allowed by the inexact minimization criteria 
in sGS-isPADMM, can offer significant computational savings especially when the subproblems 
have to be solved by a Krylov iterative solver such as the PGG method. We note that in the 
event when a quadratic subproblem in the forward or backward sweep has to be solved by a 
PGG method, the solution computed in the prior sweep or cycle should be used to serve as a 
good initial starting point for the PGG method. 

For the sPADMMSd applied to (7.2), the subproblems involving the blocks Z, S and yE 
can be solved just as in the case of the sGS-isPADMM. For the subproblems corresponding 
to the nonsmooth block y/, since these subproblems must be solved exactly, a proximal term 
whose Hessian is Amax^ — crAjA} (with A^ax being the largest eigenvalue of aAiA}) has to be 
added to ensure that an exact solution can be computed efficiently. Besides, we can also apply 
a directly extended 5-block sPADMM (we call it sPADMM5d-2 for convenience) on (7.3). In 
this case, we can use the proximal term described in (7.12) in the previous subsection, where I 
is typically chosen to be less than 10. We always choose a similar proximal term when solving 
the subproblems corresponding to W for both the sPADMMSd and the sPADMM5d-2. Since 
the performance of the sPADMM5d-2 applied to (7.3) is very similar to that of the sPADMM5d 
applied to (7.2), we only report our numerical results for the latter. 

We now report our numerical results. Figure 1 shows the numerical performance of the sGS- 
isPADMM and sPADMM4d in solving SDP-BIQ problems to the accuracy of 10“® in rjqsdp- One 
can observe that sGS-isPADMM is 3 to 5 times faster than the sPADMM4d, on approximately 
80% problems in terms of computational time. Figure 2 shows the numerical performance of 
the sGS-isPADMM and sPADMM5d in solving QSDP-BIQ problems (group 1) to the accuracy 
of 10“® in rjqsdp- For this group of tested instances, Q is chosen as the symmetrized Kronecker 
operator Q{X) = ^{AXB + BXA), as was used in [34], with A, B being randomly generated 
symmetric positive semidefinite matrices such that rank(A) = 10 and rank(i?) « n/10. Clearly 
Q is self-adjoint and positive semidefinite on 5" but highly rank deficient [33, Appendix]. As 
is showed in Figure 2, the sGS-isPADMM is at least 2 times faster than the sPADMM5d, 
on solving the vast majority the tested problems in terms of computational time. Figure 3 
demonstrates the numerical performance of the sGS-isPADMM and sPADMM5d in solving 
QSDP-BIQ problems (group 2) to the accuracy of 10“® in r]qsdp- Here, Q is chosen as the 
symmetrized Kronecker operator Q{X) = ^{AXB + BXA) with A, B being matrices truncated 
from two different large correlation matrices (Russell 1000 and Russell 2000) fetched from 


^ We did not test the sPADMM4d with r = 1 as it has been verified in [30] that it almost always takes 20% 
to 50% more time than the one with r = 1.618. 
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Performance profile: iterations Performance profile: time 




Fig. 1 Performance profiles of sGS-isPADMM and sPADMM4d on solving SDP-BIQ problems. 


Performance profile: iterations Performance profile: time 




Fig. 2 Performance profiles of sGS-isPADMM and sPADMM5d on solving QSDP-BIQ problems (group 1). 


Yahoo Finance by Matlab. As can be seen from Figure 3, sGS-isPADMM is 2 to 3 times faster 
than the sPADMMSd for about 80% of the problems in terms of computational time. Figure 4 
shows the numerical performance of the sGS-isPADMM and sPADMMSd in solving QSDP-BIQ 
problems (group 3) to the accuracy of 10“® in rjgsdp- Here, Q is chosen as the Lyapunov operator 
Q{X) = \{AX + XA) with A being a randomly generated symmetric positive semidefinite 
matrix such that rank(A) « n/10. We should note that for these instances, the quadratic 
subproblems corresponding to W in both the sGS-isPADMM and ADMMSd can admit closed- 
form solutions by using the eigenvalue decomposition of A and adapting the technique in [32, 
Section 5] to compute them. In our numerical test, we compute the closed-form solutions for 
these subproblems. As can be seen from Figure 4, the sGS-isPADMM is 2 to 4 times faster than 
the sPADMM5d for more than 90% tested instances in terms of computational time. Table 1 
gives the detailed computational results for selected tested instances with n = 501. The full 
table for all the 320 problems is available at the authors’ website®. 

To summarize, we have seen that our sGS-isPADMM is typically 2 to 3 times faster than 
the directly extended multi-block sPADMM even with the aggressive step-length of 1.618. We 
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http: //www. math, nus .edu. sg/''inattohkc/publist .html 
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Performance profile: iterations Performance profile: time 




Fig. 3 Performance profiles of sGS-isPADMM and sPADMMSd on solving QSDP-BIQ problems (group 2). 

Performance profile: iterations Performance profile: time 




Fig. 4 Performance profiles of sGS-isPADMM and sPADMMSd on solving QSDP-BIQ problems (group 3). 


achieve this by exploiting the flexibility allowed by our proposed method for only requiring 
approximate solutions to the subproblems in each iteration. In contrast, the directly extended 
sPADMM must solve the subproblems exactly, and hence is forced to add appropriate proximal 
terms which may slow down the convergence. Indeed, we observed that its convergence is often 
much slower than that of the sGS-isPADMM. The merit that is brought about by solving the 
original subproblems inexactly without adding proximal terms is thus evidently clear. 


8 Conclusions 

In this paper, by combining an inexact 2-block majorized sPADMM and the recent advances 
in the inexact symmetric Gauss-Seidel (sGS) technique for solving a multi-block convex com¬ 
posite quadratic programming whose objective contains a nonsmooth term involving only the 
first block-variable, we have proposed an inexact multi-block ADMM-type method (called the 
sGS-imsPADMM) for solving general high-dimensional convex composite conic optimization 
problems to moderate accuracy. One of the most attractive features of our proposed method 
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Table 1 The numerical performance of sGS-isPADMM and the directly extended multi-block ADMM with 
step-length r = 1.618 on 1 group of SDP-BIQ problems and 3 groups of QSDP-BIQ problems with n > 500 
(accuracy = 10“®) 


Problem 

mE\ fyii 

Us 

Iteration 

sGS-isP|sP-d 

'Hgsdp 

sGS-isP|sP-d 

Vgap 

sGS-isP|sP-d 

Time 

sGS-isP|sP-d 

SDP-BIQ 

bqp500-2 

501;374250 

501 

17525|82401 

9.9-7|9.9-7 

-6.3-7|2.3-8 

42:27|2:12:29 

bqp500-4 

501;374250 

501 

15352 75995 

9.9-7 9.9-7 

-6.4-7 -3.2-8 

36:53 1:59:52 

bqp500-6 

501;374250 

501 

17747 78119 

9.9-7 9.9-7 

-1.6-7 -2.4-8 

45:10|2:04:23 

bqp500-S 

501;374250 

501 

20386 110825 

9.9-7 9.9-7 

-4.3-7 2.1-8 

52:04 3:10:43 

bqp500-10 

501;374250 

501 

16407 68985 

9.7-7 9.9-7 

-5.6-7 3.7-9 

39:30 1:46:01 

gkalf 

501;374250 

501 

9101|60073 

9.9-7 9.9-7 

-4.4-7 1.1-8 

20:22|1:32:22 

gka2f 

501;374250 

501 

16193174034 

9.9-7 9.9-7 

-2.7-71-1.1-8 

39:35 1:59:59 

gkaSf 

501;374250 

501 

16323 72563 

9.9-7 9.9-7 

-1.3-7 3.9-8 

40:38 1:56:28 

gka4f 

501;374250 

501 

15502 63285 

9.6-7 9.9-7 

-6.1-7 3.4-8 

36:58 1:41:20 

gka5f 

501;374250 

501 

17664 76164 

9.9-7 9.9-7 

-1.3-7 1.1-8 

43:45|2:05:14 

QSDP-BIQ (group 1) 
bqp500-2 501;374250 

501 

19053|71380 

9.9-7|9.9-7 

-1.2-7|l.l-8 

1:02:31|1:52:02 

bqp500-4 

501;374250 

501 

13905 67865 

9.9-7 9.9-7 

-8.9-7 7.8-8 

43:17|1:46:07 

bqp500-6 

501;374250 

501 

17211|62562 

9.9-7 9.9-7 

-2.0-7 6.9-8 

56:23 1:37:19 

bqp500-8 

501;374250 

501 

19742 85057 

9.9-7 9.9-7 

-4.9-7 7.0-8 

1:05:09|2:15:52 

bqp500-10 

501;374250 

501 

17690 65484 

9.9-7 9.9-7 

-2.3-7 6.7-8 

58:00|1:43:04 

gkalf 

501;374250 

501 

8919|55669 

9.9-7 9.9-7 

-8.8-7 4.1-8 

26:42|1:25:01 

gka2f 

501;374250 

501 

13587|61324 

9.9-7 9.9-7 

-4.5-7 2.1-8 

42:50 1:37:15 

gkaSf 

501;374250 

501 

13786 62438 

9.9-7 9.9-7 

-2.2-7 3.1-8 

42:55 1:37:29 

gka4f 

501;374250 

501 

13953 57164 

9.6-7 9.9-7 

-7.2-7 -3.4-8 

44:25|1:31:14 

gka5f 

501;374250 

501 

15968 62001 

9.9-7 9.9-7 

-1.4-7 4.6-8 

50:22 1:35:40 

QSDP-BIQ (group 2) 
bqp500-2 501;374250 

501 

16506179086 

9.9-7|9.9-7 

-1.2-7|4.2-8 

52:46|1:52:08 

bqp500-4 

501;374250 

501 

8675|30677 

9.9-7 9.9-7 

2.7-8|2.3-8 

25:32|41:15 

bqp500-6 

501;374250 

501 

10043|42654 

9.9-7 9.9-7 

-3.0-8|8.3-8 

29:46 58:58 

bqp500-8 

501;374250 

501 

9410|43785 

9.9-7 9.9-7 

-2.5-8 2.9-8 

27:37 59:05 

bqp500-10 

501;374250 

501 

10656|35213 

9.9-7 9.9-7 

-3.6-8 8.8-8 

32:35 47:00 

gkalf 

501;374250 

501 

10939 52226 

9.9-7 9.9-7 

-5.8-8 3.8-8 

36:10 1:16:48 

gka2f 

501;374250 

501 

7757|34660 

9.9-7 9.9-7 

-1.8-8 6.0-8 

25:17|48:40 

gka3f 

501;374250 

501 

11241|45857 

9.9-7 9.9-7 

-1.2-8 2.7-8 

34:55 1:02:59 

gka4f 

501;374250 

501 

11706 37466 

9.9-7 9.9-7 

-3.7-8 6.4-8 

36:19 51:25 

gka5f 

501;374250 

501 

14229|48670 

9.9-7 9.9-7 

-4.8-8 9.8-8 

42:37 1:06:37 

QSDP-BIQ (group 3) 
bqp500-2 501;374250 

501 

18311|66867 

9.9-7|9.9-7 

-1.9-7|1.2-7 

41:33|1:11:30 

bqp500-4 

501;374250 

501 

14169 65580 

9.9-7 9.9-7 

-7.8-7 1.1-7 

30:04 1:10:29 

bqp500-6 

501;374250 

501 

16428 68301 

9.9-7 9.9-7 

-2.3-7 8.4-8 

36:25 1:13:20 

bqp500-S 

501;374250 

501 

26308 107664 

9.9-7 9.9-7 

-4.0-7 9.5-9 

1:01:17|2:00:06 

bqp500-10 

501;374250 

501 

16398 57221 

9.9-7 9.9-7 

-2.8-7 8.6-8 

37:22|1:06:27 

gkalf 

501;374250 

501 

14479|51294 

9.9-7 9.9-7 

-3.6-7 7.0-8 

31:05 59:17 

gka2f 

501;374250 

501 

9365|60799 

9.9-7 9.9-7 

-1.5-6 -1.9-9 

18:30|1:04:14 

gka3f 

501;374250 

501 

14175|57782 

9.9-7 9.9-7 

-3.2-7 2.0-8 

30:10 1:01:35 

gka4f 

501;374250 

501 

13356 56588 

9.8-7 9.9-7 

-5.8-7 -2.0-8 

27:42|1:00:10 

gka5f 

501;374250 

501 

14122|58716 

9.9-7 9.9-7 

-1.4-7 9.3-8 

29:38 1:01:13 


In the table, “sGS-isP” stands for the sGS-isPADMM and “sP-d” stands for the sPADMM-4d and sPADMM-5d 
collectively. The computation time is in the format of “hours:minutes:seconds” 


is that it only needs one cycle of the inexact sGS method, instead of an unknown number of 
cycles, to solve each of the subproblems involved. Our preliminary numerical results for solving 
320 high-dimensional linear and convex quadratic SDP problems with bound constraints, as 
well as with a large number of linear equality and inequality constraints have shown that for the 
vast majority of the tested problems, the proposed the sGS-imsPADMM is 2 to 3 times faster 
than the directly extended multi-block PADMM (with no convergence guarantee) even with 
the aggressive step-length of 1.618. This is a striking surprise given the fact that although the 
latter’s convergence is not guaranteed, it is currently the benchmark among first-order methods 
targeting to solve multi-block linear and quadratic SDPs to modest accuracy. Our results clearly 
demonstrate that one does not need to sacrifice speed in exchange for convergence guarantee 
in developing ADMM-type first order methods, at least for solving high-dimensional linear and 
convex quadratic SDP problems to moderate accuracy. As mentioned in the Introduction, our 
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goal of designing the sGS-imsPADMM is to obtain a good initial point to warm-start the aug¬ 
mented Lagrangian method so as to quickly benefit from its fast local linear convergence. So 
the next important step is to see how the sGS-imsPADMM can be exploited to produce effi¬ 
cient solvers for solving high-dimensional convex composite conic optimization problems to high 
accuracy. We leave this as our future research topic. 


Acknowledgements The authors would like to thank Xudong Li and Liuqin Yang at National University of 
Singapore for suggestions on the numerical implementations of the algorithms described in this paper, and Caihua 
Chen at Nanjing University and Ying Cui at National University of Singapore for discussions on the non-ergodic 
iteration complexity of first-order methods. 


References 

1. Chen, C., He, B., Ye, Y. and Yuan, X.: The direct extension of ADMM for multi-block convex minimization 
problems is not necessarily convergent. Math. Program. 155, 57—79 (2016) 

2. Chen, L., Sun, D.F. and Toh, K.-C.: A note on the convergence of ADMM for linearly constrained convex 
optimization problems. arXiv: 1507.02051 (2015) 

3. Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, New York (1983) 

4. Cui, Y., Li, X.D., Sun, D.F. and Toh, K.-C.: On the convergence properties of a majorized ADMM for linearly 
constrained convex optimization problems with coupled objective functions. J. Optim. Theory Appl., in print 
(2016) 

5. Davis, D. and Yin, W.: Convergence rate analysis of several splitting schemes. arXiv: 1406.4834 (2014) 

6. Eckstein, J.: Some saddle-function splitting methods for convex programming. Optim. Methods Softw. 4(1), 
75-83 (1984) 

7. Eckstein, J. and Yao, W.: Understanding the convergence of the alternating direction method of multipliers: 
theoretical and computational perspectives. Pac. J. Optim. 11(4), 619-644 (2015) 

8. Eckstein, J. and Bertsekas, D.P.: On the Douglas-Rachford splitting method and the proximal point algorithm 
for maximal monotone operators. Math. Program. 55(1-3), 293—318 (1992) 

9. Fazel, M., Pong, T.K., Sun D.F. and Tseng, P.: Hankel matrix rank minimization with applications to system 
identification and realization. SIAM J. Matrix Anal. 34(3), 946-977 (2013) 

10. Fortin, M. and Glowinski, R.: Augmented Lagrangian Methods: Applications to the Numerical Solution of 
Boundary-Value Problems. Elsevier (1983) 

11. Gabay, D. and Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite 
element approximation. Comput. Math. Appl. 2(1), 17-40 (1976) 

12. Glowinski, R.: Lectures on Numerical Methods for Non-Linear Variational Problems. Published for the Tata 
Institute of Fundamental Research, Bombay. Springer-Verlag (1980) 

13. Glowinski, R. and Marroco, A.: Sur Tapproximation, par elements finis d’ordre un, et la resolution, par 
penalisation-dualite d’une classe de problemes de Dirichlet non lineaires. Revue frangaise d’atomatique, 
Informatique Recherche Operationelle. Analyse Numerique 9(2), 41—76 (1975) 

14. He, B., Liao, L., Han D. and Yang, H.: A new inexact alternating directions method for monotone variational 
inequalities. Math. Program. 92(1), 103-118 (2002) 

15. Hestenes, M.: Multiplier and gradient methods. J. Optim. Theory Appl. 4(5), 303-320 (1969) 

16. Lemarechal, C. and Sagastizabal, C.: Practical aspects of the Moreau-Yosida regularization: theoretical 
preliminaries. SIAM J. Optim. 7(2), 367-385 (1997) 

17. Li, M., Liao, L. and Yuan, X.: Inexact alternating direction methods of multipliers with logarithmic-quadratic 
proximal regularization. J. Optim. Theory Appl. 159(2), 412-436 (2013) 

18. Li, M., Sun, D.F. and Toh, K.-C.: A majorized ADMM with indefinite proximal terms for linearly constrained 
convex composite optimization. SIAM J. Optim., in print (2016) 

19. Li, X.D.: A Two-Phase Augmented Lagrangian Method for Convex Composite Quadratic Programming. 
PhD thesis. Department of Mathematics, National University of Singapore (2015) 

20. Li, X.D. Sun, D.F. and Toh, K.-C.: A Schur complement based semi-proximal ADMM for convex quadratic 
conic programming and extensions. Math. Program. 155, 333-373 (2016) 

21. Li, X.D., Sun, D.F. and Toh. K.-C.: QSDPNAL: A two-phase Newton-CG proximal augmented Lagrangian 
method for convex quadratic semidefinite programming problems, preprint (2015) Available at http://www. 
math.nus.edu.sg/~mattohkc/papers/QSDPNAL.pdf 

22. Ng, M.K., Wang F. and Yuan. X.: Inexact alternating direction methods for image recovery. SIAM J. Sci. 
Comput. 33(4), 1643-1668 (2011) 

23. Powell, M.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, 
pp. 283-298. Academic, New York (1969) 

24. Rockafellar, R.T.: Convex Analysis. Princeton University Press (1970) 

25. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex pro¬ 
gramming. Math. Oper. Res. 1(2), 97-116 (1976) 

26. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 
877-898 (1976) 



Inexact sGS based Majorized ADMM 


25 


27. Solodov, M.V. and Svaiter, B.F.: A hybrid approximate extragradient-proximal point algorithm using the 
enlargement of a maximal monotone operator. Set-Valued Anal. 7(4), 323—345 (1999) 

28. Solodov, M.V and Svaiter, B.F.: A hybrid projection-proximal point algorithm. J. Convex Anal. 6(1), 59—70 
(1999) 

29. Solodov, M.V. and Svaiter, B.F.: An inexact hybrid generalized proximal point algorithm and some new 
results on the theory of Bregman functions. Math. Oper. Res. 25(2), 214-230 (2000) 

30. Sun, D.F., Toh, K.-C. and Yang, L.Q.: A convergent proximal alternating direction method of multipliers 
for conic programming with 4-block constraints. SIAM J. Optim. 25(2), 882—915 (2015) 

31. Sun, D.F., Toh, K.-C. and Yang, L.Q.: An efficient inexact ABCD method for least squares semidefinite 
programming. SIAM J. Optim., in print (2016) 

32. Todd. M.J.: A study of search directions in primal-dual interior-point methods for semidefinite programming. 
Optim. Methods Softw. 11, 1-46 (1999) 

33. Todd, M.J., Toh, K.-C. and Tutuncu, R.H.: On the Nesterov-Todd direction in semidefinite programming. 
SIAM J. Optim. 8, 769-796 (1998) 

34. Toh, K.-C.: An inexact primal-dual path-following algorithm for convex quadratic SDP. Math Program. 
112(1), 221-254 (2008) 



